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Abstract 

Uncertainty  quantification  is  essential  in  using  numeri¬ 
cal  models  for  prediction .  While  many  works  focused  on 
how  the  uncertainty  of  the  inputs  propagate  to  the  outputs , 
the  modeling  errors  of  the  numerical  model  were  often  over¬ 
looked.  In  our  Bayesian  framework,  modeling  errors  play 
an  essential  role  and  were  assessed  through  studying  nu¬ 
merical  solution  errors.  The  main  ideas  and  key  concepts 
will  be  illustrated  through  an  oil  reservoir  case  study.  In  this 
study,  inference  on  the  input  has  to  be  made  from  the  output. 
Bayesian  analysis  is  adopted  to  handle  this  inverse  problem, 
then  combine  it  with  the  forward  simulation  for  prediction. 
The  solution  error  models  were  established  based  on  the 
scale-up  solutions  and  fine-grid  solutions.  As  the  central 
piece  of  our  framework,  the  robustness  of  these  error  mod¬ 
els  is  fundamental  In  addition  to  the  oil  reservoir  computer 
codes,  we  will  also  discuss  the  modelling  of  solution  error 
of  shock  wave  physics.  Although  the  framework  itself  is  sim¬ 
ple,  there  is  many  statistical  challenges  which  include  opti¬ 
mal  dimension  of  the  error  model,  trade-off  between  sample 
size  and  the  solution  accuracy.  These  challenges  are  also 
discussed. 


1  Introduction 

One  purpose  of  building  complicated  numerical  models 
is  to  forecast  outcomes  of  complex  systems  such  as  climate 
change,  hydrocarbon  reservoir  production,  contamination 
of  groundwater  and  shock  wave  physics.  However,  a  scien¬ 
tific  prediction  must  be  coupled  with  a  good  uncertainty  as¬ 
sessment.  Recently,  uncertainty  quantification  (UQ)  of  nu¬ 
merical  models  has  drawn  significantly  increased  attention 
from  several  international  scientific  communities  as  well  as 
government  agencies. 

The  uncertainty  of  a  numerical  simulation  comes  from 
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three  different  sources,  the  errors  in  the  physical  model,  the 
errors  in  the  numerical  model,  and  the  uncertainty  on  the 
input  parameters  of  the  model. 

The  uncertainty  on  the  input  parameters  includes  the  re¬ 
sult  of  the  difficulty  in  measuring  corresponding  parameters 
in  the  physical  system.  For  example,  the  permeability  and 
porosity  of  a  petroleum  reservoir  could  not  be  accurately 
measured  with  today’s  technology.  In  numerical  simula¬ 
tion,  uncertainty  of  the  input  parameters  of  a  model  gives 
rise  to  uncertainty  regarding  its  output.  This  propagation 
of  uncertainty  from  input  to  the  output  has  been  the  main 
focus  of  UQ  research.  A  variety  of  sophisticated  statisti¬ 
cal  and  probability  methods  have  been  developed  and  some 
have  been  implemented  into  software  packages[6].  In  addi¬ 
tion,  sensitivity  analysis  has  also  been  applied  to  this  aspect 
of  UQ  inference[18]. 

The  error  of  a  numerical  model  comprise  solution  er¬ 
rors  and  modelling  errors.  The  solution  errors  are  the  result 
of  a  finite  accuracy  approximation  to  the  governing  equa¬ 
tions  describing  continuum  phenomenon.  The  modelling 
errors  could  be  due  to  approximations  in  the  equations  and 
the  physics  they  represent.  Incorrect  choice  of  the  physical 
models  give  rise  to  the  errors  in  the  physics  model.  For  ex¬ 
ample,  a  fault  lies  in  an  oil  reservoir  but  is  not  represented 
in  the  model.  Assessing  both  type  of  modelling  errors  is 
often  referred  to  as  model  validation[5,  14],  which  is  ar¬ 
guably  the  most  important  and  most  challenging  component 
in  UQ.  This  has  to  be  done  through  systematic  comparison 
between  experimental  and  observational  data,  and  compar¬ 
ison  among  competing  models. 

We  have  proposed  a  Bayesian  framework  for  UQ 
inference[12,  13,  4,  8].  Our  focus  is  distinct  from  the  other 
UQ  research  in  several  aspects.  First,  our  approach  is  moti¬ 
vated  by  petroleum  engineering  applications  in  which  im¬ 
portant  parameters  of  the  petroleum  reservoir  are  not  di¬ 
rectly  observable  and  have  to  be  inferred  through  an  inverse 
problem.  Similar  situations  exist  in  many  other  applications 
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including  shock  wave  physics.  Second,  practically  the  in¬ 
verse  problem  can  be  solved  only  with  coarse  grid  simula¬ 
tions  on  a  large  ensemble  sampled  in  the  parameter  space. 
Therefore,  solution  errors  plays  a  big  role  in  the  uncertainty 
quantification. 

We  believe  that  different  UQ  approaches  complement 
each  other  and  each  provides  a  unique  view  on  some  aspects 
of  this  challenging  problem.  Together  they  provide  a  com¬ 
plete  picture.  In  this  paper,  we  will  discuss  first  a  Bayesian 
Framework  for  UQ  inference,  then  three  statistical  aspects 
in  our  UQ  approach: 

1 .  probability  models  of  the  solution  errors,  which  is  at 
the  core  of  our  approach; 

2.  partition  of  uncertainty  to  all  substeps,  which  provide 
guidelines  for  improving  prediction  accuracy; 

3.  efficient  sampling  of  ensembles. 

2  A  Bayesian  UQ  inference  Framework 


Our  Bayesian  UQ  inference  includes  two  equally  impor¬ 
tant  steps.  The  first  is  the  inverse  problem .  In  this  step, 
the  uncertainty  of  the  model  parameters  that  are  not  directly 
observable,  are  reduced  using  other  observations.  This  is 
achieved  by  running  forward  simulations  on  an  ensemble 
sampled  from  the  parameter  space,  then  assign  posterior 
probabilities  to  each  element  of  the  ensemble.  The  better 
a  solution  matched  up  with  the  observations,  the  parame¬ 
ter  setting  used  for  this  solution  receives  higher  probability. 
The  posterior  probability  is  assigned  with  Bayes*  formula, 


P(m\0)  = 


P(Q\m)p(m) 
JM  P(0\m)p(m) 


(1) 


where  M  and  m  represent  the  entire  ensemble  and  one  of 
its  elements.  The  factor  P(G\m)  is  the  likelihood  which 
measures  how  the  observed  data  O  matches  with  the  solu¬ 
tion  of  m.  The  factor  P(m)  is  the  prior  probability  on  the 
ensemble  which  is  often  set  to  be  flat.  However,  the  ensem¬ 
ble  can  be  sampled  from  a  probability  distribution  that  re¬ 
flects  the  current  knowledge  on  the  parameter  space.  Using 
(1),  the  ensemble  is  refined  in  the  sense  that  the  probability 
concentrates  on  a  small  fraction  of  its  elements  and  the  re¬ 
maining  majority  have  only  negligible  probabilities.  Figure 
1  illustrates  this  idea  through  an  oil  reservoir  model  studied 
in  Devolder  et  al.[4].  The  top  of  the  figure  are  500  solutions 
of  the  entire  ensemble  sampled  from  a  random  field  describ¬ 
ing  the  geology  of  an  oil  reservoir.  The  bottem  shows  only 
those  solutions  that  better  match  with  observed  history  data. 
This  inverse  step  is  often  referred  to  as  “calibration”  or  “his¬ 
tory  matching”  by  other  authors[16, 15].  However,  it  differs 
from  “calibration”  of  the  traditional  sense.  Traditional  cal¬ 
ibration  finds  a  single  parameter  setting  that  best  matches 


the  observed  data.  In  this  inverse  step,  the  parameter  space 
is  only  “confined”  by  the  observations  but  never  reduce  to  a 
point  mass.  A  reality  of  this  step  is  that  the  forward  simula¬ 
tion  have  to  be  done  on  relatively  coarse  grids  because  one 
can  not  afford  fine  grid  solutions  on  the  entire  ensemble. 

The  second  step  in  our  UQ  inference  is  the  forward  step , 
in  which  the  numerical  simulation  extends  to  the  future  only 
on  the  refined  ensemble.  Note  that  the  ensemble  can  now 
be  replaced  by  a  subensemble,  i.e.  an  one  that  with  less  el¬ 
ements,  numerical  simulations  can  be  done  on  finer  grids 
than  in  the  inverse  step.  The  posterior  distribution  of  the 
prediction  is  a  product  of  the  probability  distribution  of  the 
ensemble  and  the  probability  distribution  of  the  simulation 
error.  This  posterior  distribution  fully  characterize  the  pre¬ 
diction  and  its  uncertainty,  and  provides  confidence  inter¬ 
vals  for  the  prediction. 

In  the  core  of  our  UQ  inference  framework  is  a  probabil¬ 
ity  model  for  numerical  simulation  errors  which  produces 
the  likelihood  p{0\m)  in  (1).  Without  an  appropriate  un¬ 
derstanding  of  these  errors,  the  inverse  step  will  be  ill  fated. 
One  would  have  no  meaningful  way  to  assess  how  the  ob¬ 
served  data  matches  with  the  numerical  solutions,  ending 
up  with  either  “under  calibrated”  model,  giving  overly  pes¬ 
simistic  uncertainty  assessment,  or  “over  calibrated”  model, 
giving  overly  optimistic  uncertainty  assessment. 

A  similar  Bayesian  approach  has  been  proposed  and  ap¬ 
plied  by  others  to  petroleum  engineering  applications [2, 17, 
15, 1].  In  these  works,  the  numerical  error  is  often  replaced 
by  an  ’’observational”  error,  which  is  then  postulated  with 
little  scientific  basis.  In  the  next  session  we  will  provide  a 
scientific  basis  for  the  error  models  and  likelihoods  being 
used  in  the  analysis. 


3  Probability  Model  of  Solution  Errors 

The  inverse  step  in  our  Bayesian  UQ  inference  relies  on 
a  probability  model  to  calculate  the  likelihood  p{0\m)  in 
(1).  Ideally,  this  probability  model  should  account  for  the 
solution  errors,  the  modelling  errors  and  the  observational 
errors.  Among  the  three,  modelling  error  is  the  most  dif¬ 
ficult  to  be  assessed.  One  needs  to  compare  results  from 
physical  experiments  with  numerical  solutions,  which  can 
only  be  done  under  limited  controllable  experimental  con¬ 
ditions  and  are  very  expensive[5].  A  number  of  authors  has 
proposed  and  applied  a  similar  Bayesian  inference  frame¬ 
work  to  quantify  uncertainties  of  numerical  simulation  pre¬ 
dictions  [15,  17,  2,  16,  3,  1].  Although  the  importance  of 
such  a  probability  error  model  are  fully  recognized,  they  ei¬ 
ther  use  subjective  models  “elicited  by  experts”[3],  or  disre¬ 
gard  it  completely  but  focus  on  statistical  modelling  of  the 
output  [16].  In  general,  the  likelihood  p(0\m)  is  obtained 
using  postulated  error  models  without  further  justification. 


PV1 


Figure  1.  A  full  ensemble  and  a  refined  en¬ 
semble  after  “history  matching” 


Our  approach  to  build  such  a  probability  error  model  is 
based  on  observation  errors  and  solution  errors.  The  latter 
can  be  observed  by  comparing  the  numerical  solutions  to 
more  accurate  numerical  solutions.  We  think  that  in  many 
applications  it  is  the  only  practical  way  to  make  physically 
meaningful  and  objective  assessment  of  the  simulation  er¬ 
ror.  Theoretically,  the  simulation  error  can  be  evaluated  by 
comparing  the  simulation  results  with  observed  data.  How¬ 
ever,  this  is  only  possible  when  the  values  of  all  physical 
parameters  describing  the  real  system  are  well  known.  For 
complex  systems,  such  cases  rarely  exist.  For  example, 
obtaining  the  exact  geology  parameters  of  an  oil  reservoir 
is  practically  impossible.  Additional,  in  our  UQ  inference 
framework,  a  large  ensemble  from  the  parameter  space  is 
sampled  for  the  inverse  problem,  and  numerical  solutions 
are  obtained  for  each  parameter  settings  in  the  ensemble. 
For  complex  systems,  the  simulations  on  the  entire  ensem¬ 
ble  has  to  be  done  on  coarse  grid  due  to  constrains  on  the 
computational  resource.  The  use  of  the  coarse  grid  solu¬ 
tion  will  make  the  solution  errors  account  for  a  large  pro¬ 


portion  of  the  total  simulation  error.  This  situation  will 
remain  .true  even  as  the  computing  power  improves  in  the 
future,  because  the  complexity  of  future  numerical  model 
will  increase  in  the  same  pace  as  the  computing  power  al¬ 
lows.  However,  one  can  afford  to  do  both  the  coarse  grid 
and  the  fine  grid  solutions  on  a  small  sample  of  the  param¬ 
eter  spaces  to  evaluate  the  solution  errors. 

Since  fine  grid  solutions  are  resource  consuming,  the  er¬ 
ror  models  have  to  be  constructed  from  a  sample  with  lim¬ 
ited  size.  This  has  two  implications.  First,  although  the  out¬ 
put  is  often  high  dimensional,  error  models  must  be  simple 
and  effective  with  low  dimensionality.  Second,  uncertainty 
due  to  the  finite  sample  size  has  to  be  assessed.  Both  issues 
have  to  be  thoroughly  studied.  In  our  previous  work[9],  the 
effect  of  finite  sample  size  can  be  handled  by  the  Wishart 
distribution  for  Gaussian  error  models.  However,  the  com¬ 
plexity  of  the  error  models  is  determined  in  ad  hoc  manners. 
Two  questions  should  be  answered: 

•  What  is  the  optimal  choice  of  the  error  model  dimen¬ 
sionality  and  how  can  the  dimension  reduction  to  be 
achieved? 

•  Is  the  Gaussian  error  models  robust  when  the  Gaussian 
assumption  is  violated? 

Regarding  the  second  question,  the  effect  of  finite  sample 
size  could  be  studied  using  re-sampling  methods,  when  the 
error  is  non-Gaussian. 

4  Transferable  Error  Models  in  Shock  Wave 
Physics 

As  discussed  above,  brutal  force  computing  is  needed  to 
establish  quality  error  models.  Such  brutal  force  may  not  al¬ 
ways  available  for  the  most  complicated  problem  and  may 
not  allowed  in  situations  that  require  quick  analysis.  There¬ 
fore,  it  is  very  important  to  study  the  transferability  of  er¬ 
ror  models  from  relatively  elementary  problems  to  complex 
problems.  The  errors  in  a  complex  system  can  be  viewed 
as  composition  of  errors  arising  from  repeated,  but  elemen¬ 
tary  processes.  One  of  our  ideas  is  to  fist  develop  the  error 
models  in  a  simple  context  then  transferred  to  a  related  but 
more  complex  context.  Because  the  simple  models  will  be 
transferred  across  a  series  of  related  contexts,  they  can  be 
used  without  reparameterization  once  verified.  This  idea, 
however,  must  be  developed,  refined  and  validated  before  it 
can  be  used. 

We  have  test  this  idea  in  ID  shock  wave  physics 
models[7].  Shock  waves  are  localized  structures,  and  lie 
on  surfaces  in  3D.  The  interactions,  i.e.  crossing  with  one 
another  or  with  fluid  or  material  surfaces  occur  on  lines,  and 
the  bifurcations,  or  modification  of  the  interaction  structures 
occur  at  isolated  points  in  space,  moving  along  curves  in 


space-time.  These  idealized  structures  are  solved  by  shock 
Hugoniots  and  their  interactions  are  given  by  solutions  of 
Riemann  problems  and  shock  polars.  Since  Riemann  prob¬ 
lems  and  shock  polars  are  basic  ingredients  in  more  com¬ 
plex  shock  wave  physics  problems,  their  errors  can  be  used 
to  build  error  models  of  those  complex  systems.  Several 
idealized  shock  wave  interaction  problems  were  studied. 

We  start  with  a  statistical  approximation  of  the  error  in 
a  given  Riemann  problem  Rq  using  multinomial  expansion 
associated  with  initial  waves  and  errors  located  inside  its 
domain  of  dependence.  For  a  ID  shock  wave  interaction 
problem,  think  of  the  solution  as  being  primarily  composed 
of  localized  waves,  interacting  through  Riemann  problems 
and  generating  outgoing  waves,  that  further  interact  in  the 
same  manner.  Each  wave  w  is  described  by  a  vector  vw  that 
records  its  strength,  location  in  state  space,  speed  and  start¬ 
ing  location  and  time,  and  the  errors  or  uncertainty  associ¬ 
ated  with  these  quantities.  The  interaction  of  waves  gen¬ 
erates  a  planar  (ID  space  and  time)  graph,  the  vertices  of 
which  are  the  Riemann  problems  and  the  bonds  are  the  trav¬ 
eling  waves,  between  Riemann  problem  interactions.  Start¬ 
ing  from  a  given  Riemann  problem  (vertex)  or  wave  (bond), 
we  can  trace  backward  and  determine  its  domain  of  depen¬ 
dence.  Call  this  graph  Q. 

For  each  Riemann  problem,  we  consider  three  types  of 
vertices,  corresponding  to  the  constant,  linear  and  bilinear 
terms  in  the  multinomial  approximation  of  solution  and  er¬ 
ror  terms.  See  [7]  for  details.  The  linear  terms  allow  a  sim¬ 
ple  propagation  law, 

Sl  =  J  w(t  —  0)du,  (2) 

where  w(t  =  0)  is  a  vector  representing  the  strength  of  the 
time  zero  wave  and  its  error  or  uncertainty,  evaluated  at  the 
beginning  of  the  path  w,  and  SL  is  the  purely  linear  prop¬ 
agation  contribution  to  a  final  time  error.  The  path  space 
integral  du  is  taken  over  all  paths  progressing  in  time  order 
through  Q  from  the  initial  time  to  the  final  vertex,  with  each 
term  weighted  by  the  appropriate  linear  factors  from  the  for¬ 
mula  for  the  approximate  solution  of  the  Riemann  problems 
transversed.  This  path  space  representation  makes  evident 
the  point  that  the  solution  SL  is  that  of  a  multiple  (linear) 
scattering  problem. 

The  amplitude  S  at  the  final  time  (vertex  of  Q)  can  sim¬ 
ilarly  be  thought  of  as  a  solution  of  a  nonlinear  multiple 
scattering  problem,  leading  to  a  representation  in  terms  of 
multipath  integrals.  Let  V  =  V(G)  be  the  set  of  vertices  of 
g 9  and  let  B  c  V  be  a  subset  of  V  where  constant  or  bilinear 
terms  occur.  The  total  amplitude  S  will  then  be  a  sum  over 
terms  S&  indexed  by  B.  For  each  v  e  B,  let  lv  be  the  the 
interaction  coefficient  obtained  for  basic  riemann  problems. 


Figure  2.  The  solution  and  its  errors  at  the 
point  ( x,t )  can  be  obtained  by  “adding  up” 
the  solution  and  errors  for  the  waves  within 
the  domain  of  dependence 


We  write 

5=  £  sB=  y,  /n1^  • 

BCV{G)  BcV{G)J  v€B 

Here  dtOQ  is  a  multipath  integral  over  all  multipaths.  The 
multipath  propagator  duos  is  a  product  of  the  individual 
propagators  u  for  each  single  path,  as  in  (2).  The  summa¬ 
tion  in  (3)  can  be  understood  schematically  as  the  sum  over 
all  events  within  the  domain  of  dependence  of  the  evalua¬ 
tion  point  (x,  £)  at  the  vertex  of  Q.  See  Fig.  2. 

The  composition  law  (3)  has  been  validated  on  several 
idealized  composite  interaction  problems.  It  shows  some 
promise  of  developing  transferable  error  model  for  less  ide¬ 
alized  problems.  More  detailed  modelling  on  the  errors 
need  to  be  done  for  more  accurate  error  models. 

5  Decomposition  of  the  Uncertainty 

Once  a  framework  to  quantify  the  uncertainty  of  predic¬ 
tion  has  been  established,  it  is  important  to  know  the  con¬ 
tribution  of  each  component  in  the  prediction  uncertainty. 
Knowing  the  relative  importance  of  each  components  will 
help  us  in  improving  the  accuracy  of  a  prediction. 

The  uncertainty  of  a  prediction  under  our  Bayesian 
framework  arises  from  the  following  sources: 

•  forward  simulation  errors  in  the  forward  step 

•  uncertainty  of  the  geology 

•  the  sparsity  of  the  observed  real  data 

•  forward  simulation  error  in  the  inverse  problem 


•  limited  size  of  the  ensemble  in  the  inverse  problem 

•  inadequacy  of  the  error  model  resulted  from  limited 
sample  size  in  the  inverse  problem 

A  breakup  of  total  uncertainty  to  individual  sources  will 
provide  guidelines  for  optimal  distribution  of  resources  to 
improve  prediction  accuracy.  For  example,  should  addi¬ 
tional  research  effort  be  spend  to  develop  more  accurate  nu¬ 
merical  models  or  more  advanced  measuring  systems  for 
better  observations;  should  the  inverse  problem  be  done  us¬ 
ing  a  large  ensemble  with  coarse  solutions  or  a  smaller  en¬ 
semble  with  finer  solutions. 

We  have  developed  some  statistical  methods  for  parti¬ 
tion  of  prediction  uncertainty  with  an  idealized  oil  reser¬ 
voir  modelfl  1].  The  idealized  Darcy  and  Buckley-Leverett 
equations 

v  =  -KAVp;  V-v  =  0 

st  +  v  ■  V/  =  0 

are  solved  for  a  total  seepage  velocity  v  and  oil  saturation 
s.  Here  K  =  K(x,  z)  is  the  random  total  permeability,  A 
the  relative  transmissivity  and  /  the  fractional  flux.  See  also 
[8,  10]  for  a  more  detailed  specification  of  the  simulations. 
We  model  the  real  problem  by  selecting  a  particular  geol¬ 
ogy  Kio  as  the  “correct”  one.  We  assume  that  the  actual 
geology  is  not  observed  and  only  the  oil  cut  (oil  to  water 
ratio)  s(t)  until  present  time  T0  is  observed.  The  goal  is 
to  predict  the  future  oil  production  /^nal  time  s(t)dt .  To  ap¬ 
ply  our  Bayesian  UQ  framework,  simulations  are  performed 
on  an  ensemble  with  varied  viscosity  ratio  v  and  perme¬ 
ability  fields  K  to  observe  the  oil  cut  Si{t),  as  shown  in 
the  top  graph  of  Figure  1 .  The  solution  error  models  are 
based  on  difference  of  arrival  time  between  the  fine  grid  so¬ 
lution  and  coarse  grid  solution.  Its  degree  of  freedom  are 
reduced  to  five,  the  breakthrough  time  and  the  incremental 
elapsed  time  at  oil  cut  levels  of  0.8,  0.6,  0.4,  and  0.2.  That 
is  A (U)  =  t(Si)  -  t(Si- 1),  where 

t{Si)  =  sup (s(£)  >  Si },  (4) 

t 

and  Si  =  1  —  0.2  -  Z,  0  <  /  <  AT,  and  A(S0)  =  t(S0).  Thus, 
the  errors  to  be  modelled  are  e(Si)  =  A /(Si)  -  AC(S*), 
where  A /  and  Ac  represents  the  fine  and  coarse  grid  solu¬ 
tion. 

Here  we  briefly  describe  a  method  to  partition  the  total 
uncertainty  into  four  different  components.  For  details,  see 
[11].  First,  we  approximate  <r2e0,  uncertainty  inherited  from 
uncertainty  of  the  geology  and  due  to  insufficient  observed 
data,  by  the  following  method.  The  posterior  is  determined 
by  fine  grid  solutions,  using  the  windowing  method,  while 
the  future  is  also  simulated  using  the  fine  grid.  We  aver¬ 
age  the  prediction  errors  over  an  ensemble  to  estimate  cr2^. 


Then  we  obtain  an  estimate  of  ofnv  by  RMS  difference  be¬ 
tween  cTgeo  and. the  prediction  errors  of  using  coarse  grid 
solution  for  the  inverse  step  and  fine  grid  for  the  forward 
step .  Subsequently,  we  obtain  an  estimate  of  <7^d  by  RMS 
difference  between  cr2eo  +  ofnv  and  the  prediction  errors 
of  using  coarse  grid  solution  for  both  inverse  and  forward 
steps.  Finally,  we  obtain  cr2tat  by  RMS  difference  between 
(jgeo  -f  <7i2nv  4-  ofwd  and  the  prediction  errors  of  using  coarse 
grid  solution  for  both  steps  and  a  statistical  approximation 
of  the  error  model. 

We  found  that  (j^eo  dominates  the  other  three  compo¬ 
nents  and  cr2tat  contributes  a  little.  This  suggests  that  the 
inverse  step  does  not  sufficiently  reduce  the  uncertainty  on 
the  geology  because  of  the  sparsity  of  the  observed  data. 
The  observed  data  does  not  contain  enough  information  for 
the  inverse  inference  on  the  parameter  space.  Therefore, 
within  the  model  assumptions  used  in  the  present  study,  ef¬ 
forts  should  be  made  to  either  gather  more  on  the  geology 
itself  or  collect  other  information  that  lead  to  better  infer¬ 
ence  on  the  geology,  if  one  would  like  to  significantly  im¬ 
prove  the  accuracy  in  prediction.  On  the  other  hand,  a  more 
adequate  probability  error  models  does  not  result  much  im¬ 
provement  on  the  prediction  but  it  is  the  cheapest  to  do.  We 
also  found  that  solution  errors  in  the  forward  step  contribu¬ 
tion  more  than  the  solution  errors  in  the  inverse  step.  This 
suggests  use  fine  grid  simulation  in  the  forward  step  and 
coarse  grid  solution  in  the  inverse  step. 

Although  we  developed  the  method  on  an  oil  reservoir 
model,  they  should  be  applicable  to  any  applications  that  fit 
into  our  UQ  inference  frameworks.  Of  course,  the  relative 
importance  of  each  components  will  not  remain  the  same. 
Methods  to  further  decompose  cr2eo  and  ofnv  will  be  devel¬ 
oped  in  the  future.  Formal  statistical  inferences  will  also 
be  developed  for  estimation  of  each  components.  Based  on 
the  reliable  methods  for  partition,  one  could  study  the  nec¬ 
essary  size  and  solution  accuracy  for  the  ensemble,  and  the 
necessary  complexity  and  sample  size  for  the  error  model. 

6  Sampling  Methods  of  Ensembles 

In  the  inverse  step  of  our  UQ  inference  framework,  an 
ensemble  is  sampled  from  the  parameter  space  and  their  so¬ 
lutions  are  matched  with  the  observed  data.  The  quality  of 
the  ensemble  is  an  important  factor  in  the  final  prediction 
uncertainty.  If  all  elements  in  the  ensemble  are  far  from 
the  real  parameters  of  the  physical  system,  then  the  predic¬ 
tion  is  doomed  to  be  poor.  However,  given  a  fixed  amount 
of  computational  resources,  deeper  sampling  in  the  param¬ 
eter  space  is  only  possible  with  solutions  on  coarser  grid.  A 
challenge  in  our  UQ  framework  is  to  find  the  delicate  bal¬ 
ance  between  the  size  of  the  ensemble  and  the  quality  of  the 
solutions. 

In  our  previous  works,  the  ensembles  are  obtained  us- 


ing  simple  random  sampling  methods  from  a  random  field. 
It  is  conceptually  simple  but  may  not  be  the  most  effi¬ 
cient.  Other  sampling  methods  and  their  combinations  de¬ 
serve  further  investigation.  Latin  hypercube  sampling  gen¬ 
erates  ensembles  that  spread  more  evenly  in  the  parame¬ 
ter  space.  Experimental  design  methods  could  also  be  used 
to  refine  the  parameter  space  for  more  effective  sampling, 
which  might  be  necessary  for  large  system  with  a  lot  of  pa¬ 
rameters.  In  real  application,  some  directly  information  on 
the  parameter  space  might  be  obtained.  For  example,  some 
rock  sample  are  obtained  for  an  oil  reservoir.  In  those  cases, 
sampling  methods  that  obtain  samples  consistent  with  those 
information  need  to  be  developed.  Another  alternative  is 
to  directly  sample  from  the  posterior  using  Markov  Chain 
Monte  Carlo  method,  as  proposed  by  Oliver  et  al.[17,  2]. 
However,  using  MCMC  methods  alone  might  be  too  slow 
for  large  problems.  But  all  of  these  choices  need  to  be  care¬ 
fully  compared  and  anyone  of  them  might  be  more  efficient 
than  others  in  some  situations. 

7  Conclusion 

In  this  paper,  we  presented  a  Bayesian  UQ  framework 
for  predictions  using  numerical  models.  What  is  essential 
to  this  framework  is  a  probability  model  for  the  simulation 
errors.  Our  approach  is  to  formulate  the  model  based  on 
solution  errors.  An  important  idea  for  build  a  robust  er¬ 
ror  model  is  transferrability,  which  allows  us  to  build  er¬ 
ror  model  of  a  complex  problem  from  the  error  models  of 
its  simpler  components.  Another  challenge  for  error  mod¬ 
els,  which  are  build  based  on  finite  samples,  is  to  determine 
its  dimensionality.  Methods  to  partition  the  the  total  un¬ 
certainty  into  several  components  is  also  important.  They 
provides  foundations  for  optimal  allocation  of  the  resources 
to  reduce  the  prediction  uncertainty.  Many  works  need  to 
be  done  in  these  areas. 
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